Locality Enhancement of Imperfectly Nested Loop Nests

Author

  • Nawaaz Ahmed
Abstract

Most numerical applications using arrays require extensive program transformation in order to perform well on current machine architectures with deep memory hierarchies. These transformations ensure that an execution of the application exploits data locality and uses the caches more effectively. The problem of exploiting data locality is well understood only for a small class of applications: programs in which all statements are present in the innermost loop of a loop nest (called perfectly-nested loops). For such programs, statement instances can be mapped to an integer lattice (called the iteration space), and important transformations can be modelled as unimodular transformations of the iteration space. This framework has permitted the systematic application of transformations like loop permutation, skewing and tiling in order to enhance locality in perfectly-nested loops. In dealing with programs that do not fall into this category, current compilers resort to ad hoc techniques to find the right sequence of transformations. For some important benchmarks, no technique is known that will discover the right sequence of transformations. In my thesis, I propose a technique that extends the framework for perfectly-nested loops to general programs. The key idea is to embed the iteration space of every statement in the program into a special iteration space called the product space. The product space can be viewed as a perfectly-nested loop nest, so this embedding generalizes techniques like code sinking and loop fusion that are used in ad hoc ways in current compilers to produce perfectly-nested loops from imperfectly-nested ones. In contrast to these ad hoc techniques, however, embeddings are chosen carefully to enhance locality. The product space is then transformed further using unimodular transformations, after which fully permutable loops are tiled, and code is generated. Code can also be generated to emulate block-recursive versions of the original program.
I demonstrate the effectiveness of this approach for dense numerical linear algebra benchmarks, relaxation codes, and the tomcatv code from the SPECfp95 benchmark suite.
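
To make the distinction between imperfectly and perfectly nested loops concrete, the following is a minimal hand-written sketch of code sinking, one of the ad hoc techniques the abstract mentions. It is not the embedding algorithm proposed in the thesis. The forward-substitution kernel, the function names, and the statement labels S1 to S3 are assumptions chosen for illustration only.

```python
def forward_subst_imperfect(L, b):
    """Imperfectly nested: S1 and S3 sit outside the inner j loop."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        x[i] = b[i]                      # S1: runs once per i
        for j in range(i):
            x[i] -= L[i][j] * x[j]       # S2: runs for each j < i
        x[i] /= L[i][i]                  # S3: runs once per i
    return x


def forward_subst_sunk(L, b):
    """Perfectly nested: every statement lives in the innermost loop,
    guarded so it executes at the same point in the original order."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        for j in range(i + 1):           # one extra iteration hosts S3
            if j == 0:
                x[i] = b[i]              # S1 sunk into the first inner iteration
            if j < i:
                x[i] -= L[i][j] * x[j]   # S2 unchanged
            if j == i:
                x[i] /= L[i][i]          # S3 sunk into the last inner iteration
    return x


# Hypothetical lower-triangular system whose solution is all ones.
L_demo = [[2.0, 0.0, 0.0],
          [1.0, 3.0, 0.0],
          [4.0, 5.0, 6.0]]
b_demo = [2.0, 4.0, 15.0]
```

Both functions execute the same statement instances in the same order, so they return identical results. Once every statement lives in the innermost loop, each instance corresponds to a point (i, j) of a single iteration space, which is the setting in which the unimodular framework described above applies.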

Similar resources

A Matrix-Based Approach to Global Locality Optimization

Global locality optimization is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout transformations. Pure loop transformations are restricted by data dependences and may not be very successful in optimizing imperfectly nested loops and explicitly parallelized programs. Although pure data transformations are not constrained by...

Tools for Performance Optimizations and Tuning of Affine Loop Nests

Multicore processors have become mainstream and the number of cores in a chip will continue to increase every year. Programming these architectures to effectively exploit their very high computation power is a nontrivial task. First, an application program needs to be explicitly restructured using a set of code transformation techniques to optimize for specific architectural features, especial...

Affine-by-Statement Transformations of Imperfectly Nested Loops

A majority of loop restructuring techniques developed so far assume that loops are perfectly nested. The unimodular approach unifies three individual transformations – loop interchange, skewing and reversal – but is still limited to perfect loop nests. This paper outlines a framework that enables the use of unimodular transformations to restructure imperfect loop nests. The concepts previously ...

Oil and Water can mix! Experiences with integrating Polyhedral and AST-based Transformations

The polyhedral model is an algebraic framework for affine program representations and transformations for enhancing locality and parallelism. Compared with traditional AST-based transformation frameworks, the polyhedral model can easily handle imperfectly nested loops and complex data dependences within and across loop nests in a unified framework. On the other hand, AST-based transformation fr...

Affine Transformations for Communication Minimized Parallelization and Locality Optimization of Arbitrarily Nested Loop Sequences

A long-running program often spends most of its time in nested loops. The polyhedral model provides powerful abstractions to optimize loop nests with regular accesses for parallel execution. Affine transformations in this model capture a complex sequence of execution-reordering loop transformations that improve performance by parallelization as well as better locality. Although a significant am...

Publication date: 2000